Simulation and Optimization¶
We’ll go through the entire simulation and optimization process in this tutorial. The general procedure on optimizing the portfolio is as follows:
Determine the asset classes (and thus the dataset used)
Simulate the returns
Run an optimization program on the simulated cube (tensor)
More details on each section will be covered.
Getting Started¶
To start off, install the required packages by running the following commands in your command prompt.
Feel free to name my_env
with whatever you like.
conda config --prepend channels conda-forge
conda config --append channels bashtage
conda config --append channels danielbok
conda create -y -n my_env python=3.7 muarch copulae allopy pandas
# you may need to init if you're using conda for the first time
conda init --all
conda activate my_env
Remember to activate the environment.
Simulation¶
For the simuluation The simulation follows the following procedure:
Load the returns dataset of the asset classes you want to simulate
For each asset class, specify the AR-GARCH model
Fit the AR-GARCH models with the log-returns data
Fit the standardized residuals of the AR-GARCH models to a Student Copula
Overwrite the correlation matrix of the Student Copula with the log-returns correlation
Simulate future returns using the AR-GARCH models with distributions “inversed” from the copula
Truncate then calibrate the first 2 moments (returns and vol) of the cube
Exploring the Data¶
Let’s start by loading and exploring the sample policy index dataset. These are the monthly returns for the 7 assets we will simulate. The data ranges from 01 Jan 1985 to 01 Oct 2017.
[1]:
import numpy as np
from allopy.datasets import load_index
returns = load_index()
log_returns = (returns + 1).apply(np.log)
_, num_assets = log_returns.shape
log_returns.head()
[1]:
CASH | NB | EILB | DMEQ | EMEQ | PE | RE | |
---|---|---|---|---|---|---|---|
DATE | |||||||
1985-01-01 | 0.004109 | 0.008796 | 0.028741 | 0.058068 | -0.014185 | 0.067508 | -0.000949 |
1985-01-02 | 0.024052 | -0.002144 | -0.052592 | 0.008257 | -0.041216 | 0.007606 | -0.018895 |
1985-01-03 | -0.030244 | 0.006433 | -0.000043 | -0.013792 | 0.017271 | 0.028742 | -0.014230 |
1985-01-04 | 0.002630 | 0.005479 | 0.020288 | -0.083561 | -0.005833 | -0.016684 | -0.022006 |
1985-01-05 | 0.007554 | 0.042372 | 0.046989 | 0.052969 | -0.008937 | 0.071804 | 0.035114 |
[2]:
log_returns.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 394 entries, 1985-01-01 to 2017-01-10
Data columns (total 7 columns):
CASH 394 non-null float64
NB 394 non-null float64
EILB 394 non-null float64
DMEQ 394 non-null float64
EMEQ 394 non-null float64
PE 394 non-null float64
RE 394 non-null float64
dtypes: float64(7)
memory usage: 24.6 KB
Forming the Models¶
By right, we are supposed to form a distinct AR-GARCH model for each asset class. However, we’ll take some short cuts by assuming every model is an AR(1)-GARCH(1,1)-Skew_T model with the exception of CASH. This exception is meant to serve as an example of how to change each model separately.
[3]:
from muarch import MUArch, UArch
# The settings defined will set the default for all models.
arch_models = MUArch(num_assets,
mean='ARX',
lags=1,
vol='GARCH',
p=1,
q=1,
power=2.0,
dist='skewt',
scale=100)
# You can specify each model separately. We'll use CASH as an example
# CASH is the model in the MUArch object
arch_models[0] = UArch(mean='CONSTANT',
lags=0,
vol='CONSTANT',
dist='skewt',
scale=100)
Fitting the Model¶
We can call the fit function and subsequently check the quality of each fit through the .summary()
method.
[4]:
arch_models.fit(log_returns)
arch_models.summary()
[4]:
CASH
Dep. Variable: | y | R-squared: | -0.000 |
---|---|---|---|
Mean Model: | Constant Mean | Adj. R-squared: | -0.000 |
Vol Model: | Constant Variance | Log-Likelihood: | -734.606 |
Distribution: | Standardized Skew Student's t | AIC: | 1477.21 |
Method: | Maximum Likelihood | BIC: | 1493.12 |
No. Observations: | 394 | ||
Date: | Sun, Jan 12 2020 | Df Residuals: | 390 |
Time: | 11:54:38 | Df Model: | 4 |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
mu | -4.6322e-03 | 8.148e-02 | -5.685e-02 | 0.955 | [ -0.164, 0.155] |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
sigma2 | 2.6251 | 0.274 | 9.575 | 1.019e-21 | [ 2.088, 3.162] |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
nu | 5.4964 | 1.387 | 3.964 | 7.368e-05 | [ 2.779, 8.214] |
lambda | -6.7665e-04 | 6.797e-02 | -9.955e-03 | 0.992 | [ -0.134, 0.133] |
Covariance estimator: robust
NB
Dep. Variable: | y | R-squared: | 0.009 |
---|---|---|---|
Mean Model: | AR | Adj. R-squared: | 0.006 |
Vol Model: | GARCH | Log-Likelihood: | -651.331 |
Distribution: | Standardized Skew Student's t | AIC: | 1316.66 |
Method: | Maximum Likelihood | BIC: | 1344.48 |
No. Observations: | 393 | ||
Date: | Sun, Jan 12 2020 | Df Residuals: | 386 |
Time: | 11:54:38 | Df Model: | 7 |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
Const | 0.1950 | 7.099e-02 | 2.746 | 6.029e-03 | [5.582e-02, 0.334] |
y[1] | 0.0896 | 5.778e-02 | 1.550 | 0.121 | [-2.368e-02, 0.203] |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
omega | 0.0415 | 8.911e-02 | 0.465 | 0.642 | [ -0.133, 0.216] |
alpha[1] | 0.0000 | 4.415e-02 | 0.000 | 1.000 | [-8.653e-02,8.653e-02] |
beta[1] | 0.9723 | 0.100 | 9.698 | 3.080e-22 | [ 0.776, 1.169] |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
nu | 8.2346 | 3.400 | 2.422 | 1.543e-02 | [ 1.571, 14.898] |
lambda | -0.0813 | 0.105 | -0.774 | 0.439 | [ -0.287, 0.125] |
Covariance estimator: robust
EILB
Dep. Variable: | y | R-squared: | 0.002 |
---|---|---|---|
Mean Model: | AR | Adj. R-squared: | -0.001 |
Vol Model: | GARCH | Log-Likelihood: | -1024.23 |
Distribution: | Standardized Skew Student's t | AIC: | 2062.46 |
Method: | Maximum Likelihood | BIC: | 2090.27 |
No. Observations: | 393 | ||
Date: | Sun, Jan 12 2020 | Df Residuals: | 386 |
Time: | 11:54:39 | Df Model: | 7 |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
Const | 0.7238 | 0.177 | 4.081 | 4.477e-05 | [ 0.376, 1.071] |
y[1] | 0.0202 | 5.402e-02 | 0.375 | 0.708 | [-8.565e-02, 0.126] |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
omega | 7.2805 | 1.601 | 4.549 | 5.402e-06 | [ 4.143, 10.418] |
alpha[1] | 0.1777 | 0.109 | 1.634 | 0.102 | [-3.549e-02, 0.391] |
beta[1] | 0.1816 | 0.153 | 1.187 | 0.235 | [ -0.118, 0.481] |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
nu | 8.8789 | 3.717 | 2.389 | 1.691e-02 | [ 1.594, 16.164] |
lambda | -0.0437 | 6.861e-02 | -0.636 | 0.524 | [ -0.178,9.080e-02] |
Covariance estimator: robust
DMEQ
Dep. Variable: | y | R-squared: | -0.004 |
---|---|---|---|
Mean Model: | AR | Adj. R-squared: | -0.006 |
Vol Model: | GARCH | Log-Likelihood: | -1142.18 |
Distribution: | Standardized Skew Student's t | AIC: | 2298.36 |
Method: | Maximum Likelihood | BIC: | 2326.18 |
No. Observations: | 393 | ||
Date: | Sun, Jan 12 2020 | Df Residuals: | 386 |
Time: | 11:54:39 | Df Model: | 7 |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
Const | 0.3412 | 0.219 | 1.558 | 0.119 | [-8.804e-02, 0.770] |
y[1] | -0.0145 | 5.746e-02 | -0.252 | 0.801 | [ -0.127,9.814e-02] |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
omega | 1.3369 | 0.568 | 2.354 | 1.860e-02 | [ 0.224, 2.450] |
alpha[1] | 0.0762 | 3.183e-02 | 2.393 | 1.672e-02 | [1.378e-02, 0.139] |
beta[1] | 0.8594 | 4.086e-02 | 21.032 | 3.314e-98 | [ 0.779, 0.939] |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
nu | 9.6054 | 4.727 | 2.032 | 4.215e-02 | [ 0.341, 18.870] |
lambda | -0.2163 | 7.351e-02 | -2.943 | 3.255e-03 | [ -0.360,-7.224e-02] |
Covariance estimator: robust
EMEQ
Dep. Variable: | y | R-squared: | 0.019 |
---|---|---|---|
Mean Model: | AR | Adj. R-squared: | 0.016 |
Vol Model: | GARCH | Log-Likelihood: | -1309.97 |
Distribution: | Standardized Skew Student's t | AIC: | 2633.94 |
Method: | Maximum Likelihood | BIC: | 2661.75 |
No. Observations: | 393 | ||
Date: | Sun, Jan 12 2020 | Df Residuals: | 386 |
Time: | 11:54:39 | Df Model: | 7 |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
Const | 0.9325 | 0.357 | 2.609 | 9.080e-03 | [ 0.232, 1.633] |
y[1] | 0.1289 | 4.625e-02 | 2.787 | 5.317e-03 | [3.826e-02, 0.220] |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
omega | 1.6863 | 1.348 | 1.251 | 0.211 | [ -0.956, 4.329] |
alpha[1] | 0.0322 | 1.742e-02 | 1.846 | 6.490e-02 | [-1.985e-03,6.629e-02] |
beta[1] | 0.9367 | 3.036e-02 | 30.856 | 4.717e-209 | [ 0.877, 0.996] |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
nu | 6.0618 | 1.870 | 3.242 | 1.186e-03 | [ 2.397, 9.726] |
lambda | -0.2297 | 7.166e-02 | -3.205 | 1.349e-03 | [ -0.370,-8.925e-02] |
Covariance estimator: robust
PE
Dep. Variable: | y | R-squared: | 0.028 |
---|---|---|---|
Mean Model: | AR | Adj. R-squared: | 0.026 |
Vol Model: | GARCH | Log-Likelihood: | -1180.08 |
Distribution: | Standardized Skew Student's t | AIC: | 2374.16 |
Method: | Maximum Likelihood | BIC: | 2401.98 |
No. Observations: | 393 | ||
Date: | Sun, Jan 12 2020 | Df Residuals: | 386 |
Time: | 11:54:39 | Df Model: | 7 |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
Const | 0.4321 | 0.246 | 1.760 | 7.845e-02 | [-4.916e-02, 0.913] |
y[1] | 0.1118 | 5.112e-02 | 2.187 | 2.872e-02 | [1.162e-02, 0.212] |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
omega | 0.9024 | 0.581 | 1.552 | 0.121 | [ -0.237, 2.042] |
alpha[1] | 0.0562 | 1.982e-02 | 2.834 | 4.595e-03 | [1.733e-02,9.502e-02] |
beta[1] | 0.9098 | 2.973e-02 | 30.603 | 1.105e-205 | [ 0.852, 0.968] |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
nu | 8.5291 | 3.344 | 2.550 | 1.076e-02 | [ 1.974, 15.084] |
lambda | -0.2765 | 8.300e-02 | -3.332 | 8.626e-04 | [ -0.439, -0.114] |
Covariance estimator: robust
RE
Dep. Variable: | y | R-squared: | 0.010 |
---|---|---|---|
Mean Model: | AR | Adj. R-squared: | 0.007 |
Vol Model: | GARCH | Log-Likelihood: | -891.715 |
Distribution: | Standardized Skew Student's t | AIC: | 1797.43 |
Method: | Maximum Likelihood | BIC: | 1825.25 |
No. Observations: | 393 | ||
Date: | Sun, Jan 12 2020 | Df Residuals: | 386 |
Time: | 11:54:39 | Df Model: | 7 |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
Const | 0.1584 | 0.127 | 1.251 | 0.211 | [-8.968e-02, 0.407] |
y[1] | 0.0813 | 5.794e-02 | 1.404 | 0.160 | [-3.221e-02, 0.195] |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
omega | 1.0374 | 0.783 | 1.325 | 0.185 | [ -0.497, 2.572] |
alpha[1] | 0.0937 | 4.863e-02 | 1.926 | 5.409e-02 | [-1.645e-03, 0.189] |
beta[1] | 0.7221 | 0.164 | 4.391 | 1.129e-05 | [ 0.400, 1.044] |
coef | std err | t | P>|t| | 95.0% Conf. Int. | |
---|---|---|---|---|---|
nu | 15.0314 | 8.833 | 1.702 | 8.880e-02 | [ -2.281, 32.344] |
lambda | -6.9447e-03 | 8.766e-02 | -7.922e-02 | 0.937 | [ -0.179, 0.165] |
Covariance estimator: robust
Fitting the Copula¶
The next thing we need to do is to fit the copula with the standardized residuals of the arch models’ fits. We can also check the fit of the copula with the .summary()
method.
[5]:
from copulae import TCopula
residuals = arch_models.residuals() # defaults to standardized residuals
cop = TCopula(num_assets)
cop.fit(residuals)
cop.summary()
[5]:
Student Copula Summary
Parameters
Degree of Freedom23.07912545920342Correlation Matrix
1.000000 | 0.132323 | -0.383796 | 0.088988 | 0.095665 | 0.069652 | 0.233252 |
0.132323 | 1.000000 | 0.399767 | -0.019967 | -0.072441 | -0.048850 | 0.178103 |
-0.383796 | 0.399767 | 1.000000 | 0.014854 | 0.038000 | 0.070871 | 0.015642 |
0.088988 | -0.019967 | 0.014854 | 1.000000 | 0.506254 | 0.699188 | 0.467597 |
0.095665 | -0.072441 | 0.038000 | 0.506254 | 1.000000 | 0.579042 | 0.317568 |
0.069652 | -0.048850 | 0.070871 | 0.699188 | 0.579042 | 1.000000 | 0.442605 |
0.233252 | 0.178103 | 0.015642 | 0.467597 | 0.317568 | 0.442605 | 1.000000 |
Fit Summary
Fit Summary | |
---|---|
Log Likelihood | -388.78612323077846 |
Variance Estimate | Not Implemented Yet |
Method | Maximum pseudo-likelihood |
Data Points | 393 |
Optimization Setup | Results | ||
---|---|---|---|
bounds | [(0.0, inf), (-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0)] | x | [ 2.30791255e+01 1.32323371e-01 -3.83796066e-01 8.89879525e-02 9.56646977e-02 6.96515171e-02 2.33251735e-01 3.99767177e-01 -1.99674055e-02 -7.24414193e-02 -4.88504786e-02 1.78102666e-01 1.48544670e-02 3.79998706e-02 7.08714854e-02 1.56419634e-02 5.06253610e-01 6.99188239e-01 4.67596725e-01 5.79042392e-01 3.17568429e-01 4.42605381e-01] |
options | {'maxiter': 20000, 'ftol': 1e-06, 'iprint': 1, 'disp': False, 'eps': 1.5e-08} | fun | -388.78612323077846 |
method | SLSQP | jac | [-0.00053054 -0.03081292 0.02364686 0.04192392 -0.02807307 -0.03360583 0.00718501 -0.03806235 -0.05757859 0.0176783 0.03008912 -0.00658247 0.07304 -0.01542351 -0.05668426 0.00757912 -0.00902673 0.01783368 0.01313841 -0.02499595 -0.02411677 0.02380602] |
None | None | nit | 35 |
None | None | nfev | 907 |
None | None | njev | 35 |
None | None | status | 0 |
None | None | message | Optimization terminated successfully. |
None | None | success | True |
All elliptical copula comes with a correlation matrix. In this case, we would like to overwrite the correlation matrix of the TCopula
with the correlation matrix of the log returns.
[6]:
np.set_printoptions(linewidth=160)
cop.sigma
[6]:
array([[ 1. , 0.13232337, -0.38379607, 0.08898795, 0.0956647 , 0.06965152, 0.23325173],
[ 0.13232337, 1. , 0.39976718, -0.01996741, -0.07244142, -0.04885048, 0.17810267],
[-0.38379607, 0.39976718, 1. , 0.01485447, 0.03799987, 0.07087149, 0.01564196],
[ 0.08898795, -0.01996741, 0.01485447, 1. , 0.50625361, 0.69918824, 0.46759673],
[ 0.0956647 , -0.07244142, 0.03799987, 0.50625361, 1. , 0.57904239, 0.31756843],
[ 0.06965152, -0.04885048, 0.07087149, 0.69918824, 0.57904239, 1. , 0.44260538],
[ 0.23325173, 0.17810267, 0.01564196, 0.46759673, 0.31756843, 0.44260538, 1. ]])
[7]:
# this is how you overwrite the correlation matrix
cop[:] = log_returns.corr()
cop.sigma
[7]:
array([[ 1. , 0.13569432, -0.35933974, 0.10954112, 0.1281724 , 0.09676939, 0.24841052],
[ 0.13569432, 1. , 0.34423657, -0.01144135, -0.11482461, -0.06096223, 0.17947296],
[-0.35933974, 0.34423657, 1. , 0.0869657 , 0.05494202, 0.11039716, 0.03943829],
[ 0.10954112, -0.01144135, 0.0869657 , 1. , 0.59790511, 0.76230971, 0.49290886],
[ 0.1281724 , -0.11482461, 0.05494202, 0.59790511, 1. , 0.67644088, 0.3457239 ],
[ 0.09676939, -0.06096223, 0.11039716, 0.76230971, 0.67644088, 1. , 0.43281785],
[ 0.24841052, 0.17947296, 0.03943829, 0.49290886, 0.3457239 , 0.43281785, 1. ]])
Simulating Future returns¶
Now that we have the copula to generate a dependency structure for each of the univariate GARCH models, we can start simulating future returns. The simulated returns data will be a 3-dimensional cube with axis time, trials and asset class respectively.
For 10000 trials and 120 periods (10 years), this will usually take some time, so give it a while. For demonstration purposes, we’ll be simulating a (120, 1000, 7) cube.
Note that the order is very important in the subsequent optimization procedure, so don’t reshape it!
[8]:
simulated_log_returns = arch_models.simulate_mc(120, 1000, custom_dist=cop.random)
cube = np.exp(simulated_log_returns) - 1 # recover as returns, instead of log returns
Removing Outliers¶
For each asset class, we define outliers as a data point that is more than 6 standard deviations away from the asset class’s mean. We replace these outliers with the asset class mean.
As an aside, there probably is a better approach for multi-variate outlier detection and replacement
[9]:
from muarch.calibrate import truncate_outliers
cube = truncate_outliers(cube, 6)
Calibrating the Moments of the Cube¶
After removing outliers in the cube, the next step is to calibrate the first 2 moments of the cube according to the forward looking assumptions that you have. The listing below shows the function to calculate the annualized mean and volatility for the data cube.
[10]:
import pandas as pd
def show_moments(data):
t, n, a = data.shape
df = pd.DataFrame({
'Asset': returns.columns,
'Mean (%)': ((cube + 1).prod(0) ** 0.1).mean(0) - 1,
'Vol (%)': ((cube + 1).reshape(t // 12, 12, n, a).prod(1) - 1).std(0).mean(0)
})
df.iloc[:, 1:] = df.iloc[:, 1:].apply(lambda x: round(100 * x, 4))
return df
[11]:
# before calibration
show_moments(cube)
[11]:
Asset | Mean (%) | Vol (%) | |
---|---|---|---|
0 | CASH | -0.0091 | 5.0712 |
1 | NB | 2.6050 | 4.2896 |
2 | EILB | 9.3393 | 11.9668 |
3 | DMEQ | 4.2744 | 14.8196 |
4 | EMEQ | 14.3037 | 30.2308 |
5 | PE | 6.5389 | 19.0356 |
6 | RE | 2.1290 | 8.2400 |
For expedience, the calibration code is shown below.
from scipy import optimize as opt
def calibrate_data(data,
mean = None,
sd = None,
time_unit=12):
data = data.copy()
y, n, num_assets = data.shape
y //= time_unit
def set_target(target_values, typ, default):
if target_values is None:
return default, [0 if typ == 'mean' else 1] * num_assets
best_guess = []
target_values = np.asarray(target_values)
assert num_assets == len(target_values) == len(
default), "vector length must be equal to number of assets in data cube"
for i, v in enumerate(target_values):
if v is None:
target_values[i] = default[i]
best_guess.append(0 if typ == 'mean' else 1)
else:
best_guess.append(target_values[i] - default[i] if typ == 'mean' else default[i] / target_values[i])
return target_values, best_guess
d = (data + 1).prod(0)
default_mean = (np.sign(d) * np.abs(d) ** (1 / y)).mean(0) - 1
default_vol = ((data + 1).reshape(y, time_unit, n, num_assets).prod(1) - 1).std(1).mean(0)
target_means, guess_mean = set_target(mean, 'mean', default_mean)
target_vols, guess_vol = set_target(sd, 'vol', default_vol)
sol = np.asarray([opt.root(
fun=_asset_moments,
x0=[gv, gm],
args=(data[..., i], tv, tm, time_unit)
).x for i, tv, tm, gv, gm in zip(range(num_assets), target_vols, target_means, guess_vol, guess_mean)])
for i in range(num_assets):
if sol[i, 0] < 0 or False:
# negative vol adjustments would alter the correlation between asset classes (flip signs)
# in such instances, we fall back to using the a simple affine transform where
# R' = (tv/cv) * r # adjust vol first
# R' = (tm - mean(R')) # adjust mean
# adjust vol
tv = default_vol[i]
cv = sd[i] if sd[i] is not None else tv
data[..., i] *= tv / cv # tv / cv
# adjust mean
tm, cm = target_means[i], data[..., i].mean()
data[..., i] += (tm - cm) # (tm - mean(R'))
else:
data[..., i] = data[..., i] * sol[i, 0] + sol[i, 1]
return data
def _asset_moments(x: np.ndarray, asset: np.ndarray, t_vol: float, t_mean: float, time_unit: int):
t, n = asset.shape # time step (month), trials
y = t // time_unit
calibrated = asset * x[0] + x[1]
d = (calibrated + 1).prod(0)
mean = (np.sign(d) * np.abs(d) ** (1 / y)).mean() - t_mean - 1
vol = ((calibrated.reshape((y, time_unit, n)) + 1).prod(1) - 1).std(1).mean(0) - t_vol
return vol, mean
[12]:
# calibration
from muarch.calibrate import calibrate_data
target_mean = [0, 0.03, 0.06, 0.08, 0.11, 0.12, 0.05]
target_vol = [0.055, 0.073, 0.14, 0.09, 0.22, 0.18, 0.1]
cube = calibrate_data(cube, target_mean, target_vol)
show_moments(cube)
[12]:
Asset | Mean (%) | Vol (%) | |
---|---|---|---|
0 | CASH | 0.0 | 5.0818 |
1 | NB | 3.0 | 6.7131 |
2 | EILB | 6.0 | 12.8631 |
3 | DMEQ | 8.0 | 8.1705 |
4 | EMEQ | 11.0 | 20.1085 |
5 | PE | 12.0 | 16.1869 |
6 | RE | 5.0 | 9.1215 |
Optimization¶
The data used for the optimization is the truncated and calibrated cube from the simulation step before. We have 2 optimizers, the BaseOptimizer
and the PortfolioOptimizer
.
The PortfolioOptimizer
is built on top of the BaseOptimizer
. So if you need to do common optimization operations, use the PortfolioOptimizer
as it probably already has the routines. If you need to work from scratch on something custom, use the BaseOptimizer
.
We’ll explore the BaseOptimizer
first to understand how to code out a Maximize Expected Returns subject to CVaR and Volatility constraints.
An important thing to note is that while we simulated monthly data, for optimization, we use quarterly data. Here’s a fast way to convert monthly to quarterly data.
[13]:
t, n, _ = cube.shape
q_cube = (cube + 1).reshape(t // 3, 3, n, num_assets).prod(1) - 1
Let’s start with the BaseOptimizer
[14]:
from allopy import BaseOptimizer
opt = BaseOptimizer(num_assets)
bounds = np.array([
(0, 0.05),
(0, 0.4),
(0, 0.1),
(0, 0.35),
(0, 0.2),
(0, 0.15),
(0, 0.15),
])
opt.set_bounds(bounds[:, 0], bounds[:, 1])
max_vol = 0.1
max_cvar = -0.33
def obj_fun(w):
# Maximize returns assuming there's rebalancing every quarter
return ((q_cube @ w) + 1).prod(0).mean() - 1
def vol_c_fun(w):
years = len(q_cube) // 4
annual_data = (q_cube @ w + 1).reshape(years, 4, n).prod(1) - 1
annual_vols = q_cube.std(0)
return annual_vols.mean() - max_vol
def cvar_c_fun(w):
# cvar usually calculated for 3 years
returns = ((q_cube[:3 * 12] @ w) + 1).prod(0) - 1
cutoff = np.percentile(returns, 5) # get the 5%
cvar = returns[returns <= cutoff].mean()
return max_cvar - cvar
opt.add_equality_constraint(lambda w: sum(w) - 1)
opt.set_max_objective(obj_fun)
opt.add_inequality_constraint(vol_c_fun)
opt.add_inequality_constraint(cvar_c_fun)
weights = opt.optimize()
bopt = pd.DataFrame({
'Asset': returns.columns,
'Weights': np.round(weights * 100, 2)
})
bopt
[14]:
Asset | Weights | |
---|---|---|
0 | CASH | 0.0 |
1 | NB | 5.0 |
2 | EILB | 10.0 |
3 | DMEQ | 35.0 |
4 | EMEQ | 20.0 |
5 | PE | 15.0 |
6 | RE | 15.0 |
Using the PortfolioOptimizer
.
[15]:
from allopy import PortfolioOptimizer, OptData
main_data = OptData(q_cube, time_unit='quarterly')
opt = PortfolioOptimizer(main_data, cvar_data=main_data[:12])
opt.set_bounds(bounds[:, 0], bounds[:, 1])
weights = opt.maximize_returns(max_vol=max_vol, max_cvar=max_cvar)
aopt = pd.DataFrame({
'Asset': returns.columns,
'Weights': np.round(weights * 100, 2)
})
aopt
[15]:
Asset | Weights | |
---|---|---|
0 | CASH | 0.00 |
1 | NB | 5.34 |
2 | EILB | 10.00 |
3 | DMEQ | 35.00 |
4 | EMEQ | 20.00 |
5 | PE | 15.00 |
6 | RE | 14.66 |